Wild Card Week: Artificial Intelligence in Vision Recognition

Artificial Intelligence (AI) refers to the capability of a machine to imitate intelligent human behavior. In vision recognition, AI allows computers to interpret and understand the visual world. Using digital images from cameras and videos, AI systems can identify and classify objects, recognize patterns, and make decisions. This technology is used in applications ranging from facial recognition in security systems to the analysis of medical images for diagnostics, which is the use case for this project.

1. How AI Understands Images: The Basics

AI systems learn to recognize images through a process similar to how humans learn from experience. The core of this process is training an AI model on large sets of images that are already labeled, for example pictures of cats labeled "cat", so that the system learns to associate the visual patterns with the label. Over time, the model learns these patterns well enough to identify and classify new images it has never seen before.
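
To make the idea of learning from labeled examples concrete, here is a minimal sketch. It is not how a neural network trains: it uses a much simpler stand-in (a nearest-centroid classifier) on invented 2x2 grayscale "images", and the function names and pixel values are assumptions chosen only for illustration.

```python
from statistics import fmean


def train(examples):
    """Average the pixel vectors of each label into one centroid per label."""
    by_label = {}
    for pixels, label in examples:
        by_label.setdefault(label, []).append(pixels)
    return {
        label: [fmean(column) for column in zip(*vectors)]
        for label, vectors in by_label.items()
    }


def predict(centroids, pixels):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((p - c) ** 2 for p, c in zip(pixels, centroids[label]))
    return min(centroids, key=distance)


# Invented toy data: 2x2 grayscale images flattened to 4 values in [0, 1].
labeled_examples = [
    ([0.9, 0.8, 0.9, 0.7], "cat"),
    ([0.8, 0.9, 0.7, 0.9], "cat"),
    ([0.1, 0.2, 0.1, 0.3], "not cat"),
    ([0.2, 0.1, 0.3, 0.1], "not cat"),
]

centroids = train(labeled_examples)

# An image the model has not seen during training.
new_image = [0.85, 0.75, 0.8, 0.9]
predicted_label = predict(centroids, new_image)
print(predicted_label)  # cat
```

The "training" step only summarizes the labeled examples, and the "prediction" step compares a new image against those summaries. Real vision models learn far richer patterns, but the labeled-data-in, label-out structure is the same.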

2. Types of AI Models for Vision Recognition

There are several types of AI models that can be used for vision recognition, including:

  • Convolutional Neural Networks (CNNs): These are specifically designed for processing pixel data, and are used extensively in image and video recognition.
  • Recurrent Neural Networks (RNNs): Suited to sequential data such as video, where the current frame depends on the previous ones.
  • Autoencoders: Used for tasks like image denoising and dimensionality reduction to help improve the quality and efficiency of image recognition.
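
As a rough illustration of what the "convolutional" part of a CNN does with pixel data, the sketch below slides a small filter over a tiny image in plain Python. The image, the kernel values, and the function name `convolve2d` are assumptions made for this example. In a real CNN the kernel values are learned during training rather than written by hand, and, as in most deep learning libraries, the operation shown is technically a cross-correlation (the kernel is not flipped).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Invented 4x5 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Hand-written kernel that responds to dark-to-bright vertical edges.
vertical_edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge_kernel)
print(feature_map)  # [[0, 3, 3], [0, 3, 3]]
```

The output is zero over the uniform dark region and positive where the window covers the dark-to-bright transition, which is the kind of local pattern detection that CNN layers stack to recognize larger shapes.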

3. Implementing a Vision Recognition Project

A vision recognition project typically involves several key phases:

  • Data Collection: Gathering a diverse set of images that represent all the variations you expect the AI to handle.
  • Model Training: Using the collected data to train your AI model. This involves adjusting the model’s parameters so that it correctly identifies and classifies as many of the training inputs as possible.
  • Testing and Tuning: After training, the model is tested with new images it hasn't seen before. Based on the results, adjustments might be made to improve accuracy.
  • Deployment: Once the model performs well, it’s deployed in real-world applications, where it can provide insights or automation based on visual data.
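
The testing phase above usually comes down to comparing the model's predictions on unseen images with their true labels. The sketch below shows one simple way to do that, assuming the predictions are already available as a list. The labels and predictions are invented placeholders, not results from any real model, and the helper names are hypothetical.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Fraction of test images whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def confusion_counts(true_labels, predicted_labels):
    """Count each (true label, predicted label) pair."""
    return Counter(zip(true_labels, predicted_labels))


# Invented placeholder data: 4 test images per class.
true_labels = ["cat", "cat", "cat", "cat", "dog", "dog", "dog", "dog"]
predicted_labels = ["cat", "cat", "cat", "dog", "dog", "dog", "dog", "cat"]

test_accuracy = accuracy(true_labels, predicted_labels)
confusion = confusion_counts(true_labels, predicted_labels)

print(test_accuracy)               # 0.75
print(confusion[("cat", "dog")])   # 1 cat image was misclassified as dog
```

Looking at the confusion counts rather than accuracy alone shows which class the model gets wrong, which helps decide what kind of additional examples to collect during tuning.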

Teachable Machine

Teachable Machine from Google is an easy-to-use web tool that allows users to create machine learning models without any coding experience. It's designed to make AI accessible by enabling anyone to train and deploy models directly from their web browser.
For my week, I decided to take a biomedical approach: I downloaded a dataset of Brain MRI Images for Brain Tumor Detection and had the machine learn to distinguish a tumorous brain MRI from a non-tumorous one. Important disclaimer: this is an academic project and is not intended for real medical diagnosis or treatment.

How Teachable Machine Works
  • Choose a Model Type: Users can start by selecting the type of model they want to create, which includes image recognition, sound recognition, or pose detection.
    [Image: Teachable Machine interface]
  • Gather and Upload Data: Upload examples directly into the tool. For this image recognition project, we select all the images of Class 1, “Brain Tumor”, upload them, and repeat this step for Class 2, “No brain tumor”. It is important to note that 10 random images of each class were taken out of the original set beforehand, so that the test images are not among the training images.
    [Image: Teachable Machine data upload]
  • Train the Model: Teachable Machine uses the uploaded data to train the model. This process is automated and visualized on the screen, so we can see the model learn in real-time.
    [Image: Teachable Machine training]
  • Test the Model: Once training is complete, we test the model by inputting new data to see how well it performs. This helps in understanding whether the model needs more examples or adjustments. For our model, the learning seems to be really good, and it does not appear to need further examples.
  • Export or Deploy: After testing, the model can be exported to various formats or deployed directly within websites or applications. Teachable Machine provides easy integration options, including links and downloadable files. Thanks to this, you can use our AI right HERE!
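
The step of setting aside 10 random images per class before uploading can be sketched in a few lines of Python. This is only one possible way to do it, not necessarily how it was done for this project: the file names, class sizes, and the `hold_out` helper are hypothetical, and the function assumes every path in the list is unique.

```python
import random


def hold_out(image_paths, n_test=10, seed=0):
    """Split one class's image paths into (train_paths, test_paths)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    test_set = set(rng.sample(image_paths, n_test))
    train_paths = [p for p in image_paths if p not in test_set]
    test_paths = [p for p in image_paths if p in test_set]
    return train_paths, test_paths


# Hypothetical file names and class sizes, for illustration only.
tumor_paths = [f"tumor_{i:03d}.jpg" for i in range(40)]
no_tumor_paths = [f"no_tumor_{i:03d}.jpg" for i in range(30)]

tumor_train, tumor_test = hold_out(tumor_paths)
no_tumor_train, no_tumor_test = hold_out(no_tumor_paths)

print(len(tumor_train), len(tumor_test))        # 30 10
print(len(no_tumor_train), len(no_tumor_test))  # 20 10
```

Only the training lists would be uploaded to Teachable Machine. The held-out lists are kept aside and used afterwards to test the model on images it has never seen.
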
How could I use this in the future?

This week, I learned a lot about how AI operates, particularly in image recognition. This knowledge has significantly broadened my perspective and given me the skills needed for a future improvement to my final project. For that project, I plan to use a camera to provide a live feed of the patient's footprint. By integrating AI technology, I could potentially assist physicians in diagnosing various foot health and posture-related pathologies, such as flatfoot or hollow foot. The AI's capabilities in analyzing and recognizing patterns in the live feed could provide valuable insights that would otherwise be challenging to detect.
However, it is essential to emphasize that AI is a powerful tool that complements, rather than replaces, the expertise of physicians. While the AI can process and analyze the data, the ultimate diagnosis and treatment decisions rest with the healthcare professional. The physician will consider the AI's findings along with the patient's comprehensive clinical information to determine the best possible treatment for each pathology.

Files